
    Intermediate Representations for Controllers in Chip Generators

    Creating parameterized “chip generators” has been proposed as one way to decrease chip NRE costs. While many approaches are available for creating or generating flexible data path elements, the design of flexible controllers is more problematic. The most common approach is to build the controller as a microcoded engine, which offers flexibility through programmable table-based lookup functions. This paper shows that after “programming” the hardware for the desired application or applications, these flexible controller designs can be easily converted to efficient fixed (or less programmable) solutions using partial-evaluation capabilities that are already present in most synthesis tools.
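The conversion the abstract describes can be illustrated with a toy sketch. Everything below is hypothetical (the two-state handshake table, the function names), and Python's `exec` stands in for the constant propagation a synthesis tool would perform on a frozen microcode table:

```python
def microcoded_step(microcode, state, inputs):
    """Flexible controller: all behavior lives in the programmable table."""
    return microcode[(state, inputs)]

def specialize(microcode):
    """Fold a now-constant table into straight-line code, eliminating the
    lookup -- a software stand-in for partial evaluation during synthesis."""
    lines = ["def step(state, inputs):"]
    for (s, i), nxt in sorted(microcode.items()):
        lines.append(f"    if (state, inputs) == ({s}, {i}): return {nxt!r}")
    lines.append("    raise ValueError('undefined transition')")
    ns = {}
    exec("\n".join(lines), ns)
    return ns["step"]

# A two-state handshake controller, "programmed" via its table:
table = {(0, 0): (0, "idle"), (0, 1): (1, "start"),
         (1, 0): (1, "busy"), (1, 1): (0, "done")}
fixed_step = specialize(table)
assert fixed_step(0, 1) == microcoded_step(table, 0, 1) == (1, "start")
```

The specialized `step` no longer consults a table at all; in hardware terms, the programmable lookup memory has become fixed combinational logic.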

    Long-term modification of cortical synapses improves sensory perception

    Synapses and receptive fields of the cerebral cortex are plastic. However, changes to specific inputs must be coordinated within neural networks to ensure that excitability and feature selectivity are appropriately configured for perception of the sensory environment. Long-lasting enhancements and decrements to rat primary auditory cortical excitatory synaptic strength were induced by pairing acoustic stimuli with activation of the nucleus basalis neuromodulatory system. Here we report that these synaptic modifications were approximately balanced across individual receptive fields, conserving mean excitation while reducing overall response variability. Decreased response variability should increase detection and recognition of near-threshold or previously imperceptible stimuli, as we found in behaving animals. Thus, modification of cortical inputs leads to wide-scale synaptic changes, which are related to improved sensory perception and enhanced behavioral performance.

    Reconstructing a 3D line from a single catadioptric image

    This paper demonstrates that, for axial non-central optical systems, the equation of a 3D line can be estimated using only four points extracted from a single image of the line. This result, which is a direct consequence of the lack of vantage point, follows from a classic result in enumerative geometry: there are exactly two lines in 3-space which intersect four given lines in general position. We present a simple algorithm to reconstruct the equation of a 3D line from four image points. This algorithm is based on computing the Singular Value Decomposition (SVD) of the matrix of Plücker coordinates of the four corresponding rays. We evaluate the conditions for which the reconstruction fails, such as when the four rays are nearly coplanar. Preliminary experimental results using a spherical catadioptric camera are presented. We conclude by discussing the limitations imposed by poor calibration and numerical errors on the proposed reconstruction algorithm.
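The core computation -- four rays, an SVD null space, and the Plücker quadric -- can be sketched in a few lines. This is a minimal illustration of the enumerative-geometry fact, not the paper's implementation; the helper names and test geometry are mine:

```python
import numpy as np

def plucker(p, q):
    """Plücker coordinates (d, m) of the 3D line through points p and q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    d = q - p
    return np.concatenate([d, np.cross(p, d)])

def meets(x, y):
    """Bilinear Plücker form: zero iff lines x and y intersect (or are parallel)."""
    return x[:3] @ y[3:] + x[3:] @ y[:3]

def lines_meeting_four(rays):
    """Return the (generically two) lines meeting four given lines.

    Stack the rays with direction/moment halves swapped so every candidate
    line lies in the matrix's null space; recover that 2D null space by SVD,
    then impose the Plücker quadric d.m = 0, a homogeneous quadratic in the
    null-space mixing coefficients (a : b)."""
    A = np.array([np.concatenate([r[3:], r[:3]]) for r in rays])
    _, _, Vt = np.linalg.svd(A)
    L1, L2 = Vt[-2], Vt[-1]                      # null-space basis
    c2, c1, c0 = meets(L1, L1), meets(L1, L2), meets(L2, L2)
    disc = c1 * c1 - c2 * c0                     # c2*a^2 + 2*c1*a*b + c0*b^2 = 0
    if disc < 0:
        return []                                # the two solutions are complex
    r = np.sqrt(disc)
    sols = []
    for s in (r, -r):
        a, b = -c1 + s, c2
        if abs(a) < 1e-12 and abs(b) < 1e-12:
            a, b = c0, -c1 - s                   # same root, other parameterization
        X = a * L1 + b * L2
        sols.append(X / np.linalg.norm(X))
    return sols

# Two known skew lines and four transversals that meet both:
L = plucker([0, 0, 0], [1, 0, 0])
M = plucker([0, 1, 0], [0, 1, 1])
rays = [plucker([t, 0, 0], [0, 1, s])
        for t, s in [(0.3, 1.7), (0.7, -0.4), (1.3, 0.9), (2.1, 2.5)]]
recovered = lines_meeting_four(rays)   # two lines, which are ±L and ±M up to scale
```

As the abstract notes, the construction degrades when the four rays approach a degenerate configuration (e.g. nearly coplanar), which shows up here as a nearly rank-deficient `A` or a discriminant near zero.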

    Opening the Sensornet Black Box

    We argue that the principal cause of sensornet deployment and development difficulty is an inability to observe a network’s internal operation. We further argue that this lack of visibility is due to the activity and resource constraints enforced by limited energy. We present the Mote Network (MNet) architecture, which elevates visibility to be its dominant design principle. We propose a quantitative metric for network visibility and explain why network isolation and fairness are critical concerns. We describe the Fair Waiting Protocol (FWP), MNet’s single-hop protocol, and show how its fairness and isolation can improve throughput and efficiency. We present the Pull Collection Protocol as a case study in designing multi-hop protocols in the architecture.

    Verification of Chip Multiprocessor Memory Systems Using A Relaxed Scoreboard

    Verification of chip multiprocessor memory systems remains challenging. While formal methods have been used to validate protocols, simulation is still the dominant method used to validate memory system implementations. A memory scoreboard, a high-level model of the memory, greatly aids simulation-based validation, but accurate scoreboards are complex to create, since they often depend not only on the memory and consistency model but also on its specific implementation. This paper describes a methodology for using a relaxed scoreboard, which greatly reduces the complexity of creating these memory models. The relaxed scoreboard tracks the operations of the system to maintain a set of values that could possibly be valid for each memory location. By allowing multiple possible values, the model used in the scoreboard is only loosely coupled with the specific design. This decouples the construction of the checker from the implementation, allowing the checker to be used early in the design and to be built up incrementally, and it greatly reduces the scoreboard design effort. We demonstrate the use of the relaxed scoreboard in verifying RTL implementations of two different memory models, Transactional Coherence and Consistency (TCC) and Relaxed Consistency, for up to 32 processors. The resulting checker has a performance slowdown of 19% for checking Relaxed Consistency and less than 30% for TCC, allowing it to be used in all simulation runs.
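The key idea -- track a set of possibly valid values per location rather than predicting the single legal one -- can be sketched as a toy model. The class, method names, and pruning policy below are illustrative assumptions, not the paper's design:

```python
class RelaxedScoreboard:
    """Toy relaxed scoreboard: each address maps to the SET of values a
    load could legally observe, so a check fails only when an observed
    value is outside that set."""

    def __init__(self, initial=0):
        self.possible = {}          # address -> set of possibly-visible values
        self.initial = initial

    def write(self, addr, value):
        # A new in-flight store becomes one more legal outcome for later loads.
        self.possible.setdefault(addr, {self.initial}).add(value)

    def commit(self, addr, value):
        # An ordering point (e.g. a fence or a transaction commit) makes one
        # value globally visible and retires the stale alternatives.
        self.possible[addr] = {value}

    def check_read(self, addr, observed):
        return observed in self.possible.get(addr, {self.initial})

sb = RelaxedScoreboard()
sb.write(0x40, 1)                   # two in-flight stores to the same address
sb.write(0x40, 2)
assert sb.check_read(0x40, 2)       # either store is a legal observation...
assert sb.check_read(0x40, 0)       # ...as is the old value, pre-ordering
assert not sb.check_read(0x40, 7)   # a value no store produced: a violation
sb.commit(0x40, 2)
assert not sb.check_read(0x40, 1)   # after commit only one value remains legal
```

Because the checker only shrinks the allowed set at ordering points it actually observes, it stays valid across implementations that reorder stores differently, which is what lets it be built before the design is finalized.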

    B.7.2 [Hardware]: Integrated Circuits – Design Aids

    The drive for low-power, high-performance computation, coupled with the extremely high design costs of ASICs, has driven a number of designers to try to create a flexible, universal computing platform that will supersede the microprocessor. We argue that these flexible, general computing chips are trying to accomplish more than is commercially needed. Since design NRE costs are an order of magnitude larger than fabrication NRE costs, a two-step design system seems attractive. First, the users configure/program a flexible computing framework to run their application with the desired performance. Then, the system “compiles” the program and configuration, tailoring the original framework to create a chip that is optimized toward the desired set of applications. Thus the user gets the reduced development costs of a flexible solution with the efficiency of a custom chip.

    Understanding sources of inefficiency in general-purpose chips

    Due to their high volume, general-purpose processors, and now chip multiprocessors (CMPs), are much more cost-effective than ASICs, but lag significantly in terms of performance and energy efficiency. This paper explores the sources of these performance and energy overheads in general-purpose processing systems by quantifying the overheads of a 720p HD H.264 encoder running on a general-purpose CMP system. It then explores methods to eliminate these overheads by transforming the CPU into a specialized system for H.264 encoding. We evaluate the gains from customizations useful to broad classes of algorithms, such as SIMD units, as well as those specific to a particular computation, such as customized storage and functional units. The ASIC is 500x more energy efficient than our original four-processor CMP. Broadly applicable optimizations improve performance by 10x and energy by 7x. However, the very low energy cost of actual core ops (100s of fJ in 90nm) means that over 90% of the energy used in these solutions is still “overhead”. Achieving ASIC-like performance and efficiency requires algorithm-specific optimizations. For each sub-algorithm of H.264, we create a large, specialized functional unit capable of executing 100s of operations per instruction. This improves performance and energy by an additional 25x, and the final customized CMP matches an ASIC solution’s performance, coming within 3x of its energy and within comparable area.
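The abstract's overhead accounting can be made concrete with toy numbers. Only the quoted figures (a roughly 500x ASIC energy advantage, 7x from broadly applicable optimizations, over 90% residual overhead) come from the text; the absolute per-op energy below is an illustrative assumption:

```python
# Toy energy accounting in the spirit of the abstract's overhead analysis.
core_op_energy = 300e-15                          # "100s of fJ" per core op (assumed value)
baseline_per_op = core_op_energy * 500            # CMP ~500x less efficient than the ASIC
optimized_per_op = baseline_per_op / 7            # after broadly applicable optimizations (7x)

# Fraction of the optimized design's energy that is NOT useful computation:
overhead_fraction = 1 - core_op_energy / optimized_per_op
assert overhead_fraction > 0.90                   # matches the ">90% overhead" claim
```

The arithmetic makes the abstract's point visible: even after a 7x energy improvement, useful arithmetic accounts for only a few percent of total energy, so closing the remaining gap requires the algorithm-specific functional units the paper goes on to build.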